[#441] Preload embedding model at startup#461
[#441] Preload embedding model at startup#461sahilds1 wants to merge 7 commits intoCodeForPhilly:developfrom
Conversation
… routed to the application instance
There was a problem hiding this comment.
Pull request overview
This PR aims to ensure the SentenceTransformer embedding model is loaded during Django startup (before traffic hits the instance) and to make the embeddings search logic more testable by factoring it into smaller functions.
Changes:
- Refactors
get_closest_embeddingsby extracting query building, evaluation, and usage logging into helper functions. - Adds pytest + pytest-django support (requirements +
pytest.ini) and new unit tests for embedding service helpers. - Updates GitHub Actions workflow and README to run backend tests.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
server/requirements.txt |
Adds pytest dependencies to support running backend tests. |
server/pytest.ini |
Configures pytest-django settings/module and python path for the server package. |
server/api/services/test_embedding_services.py |
Adds unit tests for query evaluation and usage logging helpers. |
server/api/services/embedding_services.py |
Refactors embeddings search into build_query, evaluate_query, log_usage, and reworks get_closest_embeddings. |
server/api/apps.py |
Attempts to preload the embedding model during Django app initialization via ready(). |
README.md |
Documents how to run backend tests inside the backend container. |
.github/workflows/python-app.yml |
Changes CI branch targets and adds dependency install + pytest execution. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| @@ -32,59 +31,52 @@ def get_closest_embeddings( | |||
| Returns | |||
| ------- | |||
| list[dict] | |||
| List of dictionaries containing embedding results with keys: | |||
| - name: document name | |||
| - text: embedded text content | |||
| - page_number: page number in source document | |||
| - chunk_number: chunk number within the document | |||
| - distance: L2 distance from query embedding | |||
| - file_id: GUID of the source file | |||
| QuerySet | |||
| Unevaluated Django QuerySet ordered by L2 distance, sliced to num_results | |||
| """ | |||
|
|
|||
| encoding_start = time.time() | |||
| transformerModel = TransformerModel.get_instance().model | |||
| embedding_message = transformerModel.encode(message_data) | |||
| encoding_time = time.time() - encoding_start | |||
|
|
|||
| db_query_start = time.time() | |||
|
|
|||
| # Django QuerySets are lazily evaluated | |||
| if user.is_authenticated: | |||
| # User sees their own files + files uploaded by superusers | |||
| closest_embeddings_query = ( | |||
| Embeddings.objects.filter( | |||
| Q(upload_file__uploaded_by=user) | Q(upload_file__uploaded_by__is_superuser=True) | |||
| ) | |||
| .annotate( | |||
| distance=L2Distance("embedding_sentence_transformers", embedding_message) | |||
| ) | |||
| .order_by("distance") | |||
| queryset = Embeddings.objects.filter( | |||
| Q(upload_file__uploaded_by=user) | Q(upload_file__uploaded_by__is_superuser=True) | |||
| ) | |||
| else: | |||
| # Unauthenticated users only see superuser-uploaded files | |||
| closest_embeddings_query = ( | |||
| Embeddings.objects.filter(upload_file__uploaded_by__is_superuser=True) | |||
| .annotate( | |||
| distance=L2Distance("embedding_sentence_transformers", embedding_message) | |||
| ) | |||
| .order_by("distance") | |||
| ) | |||
| queryset = Embeddings.objects.filter(upload_file__uploaded_by__is_superuser=True) | |||
|
|
|||
| queryset = ( | |||
| queryset | |||
| .annotate(distance=L2Distance("embedding_sentence_transformers", embedding_vector)) | |||
| .order_by("distance") | |||
| ) | |||
|
|
|||
| # Filtering to a document GUID takes precedence over a document name | |||
| if guid: | |||
| closest_embeddings_query = closest_embeddings_query.filter( | |||
| upload_file__guid=guid | |||
| ) | |||
| queryset = queryset.filter(upload_file__guid=guid) | |||
| elif document_name: | |||
| closest_embeddings_query = closest_embeddings_query.filter(name=document_name) | |||
| queryset = queryset.filter(name=document_name) | |||
|
|
|||
| # Slicing is equivalent to SQL's LIMIT clause | |||
| closest_embeddings_query = closest_embeddings_query[:num_results] | |||
| return queryset[:num_results] | |||
There was a problem hiding this comment.
build_query() introduces/relocates important filtering + precedence logic (authenticated vs unauthenticated visibility; guid-over-document_name; LIMIT slicing), but the new tests only cover evaluate_query and log_usage. Add unit/integration tests covering build_query behavior (e.g., guid precedence and the authenticated/unauthenticated queryset filters) to prevent regressions in access control and filtering.
| def ready(self): | ||
| from .services.sentencetTransformer_model import TransformerModel | ||
| TransformerModel.get_instance() |
There was a problem hiding this comment.
ApiConfig.ready() will only run if this AppConfig is actually used by Django. Right now INSTALLED_APPS appears to include just "api" (not "api.apps.ApiConfig"), and api/__init__.py doesn’t set a default config, so this preload hook may never execute. Consider updating INSTALLED_APPS to reference api.apps.ApiConfig (or otherwise ensuring this config is selected) so the model is preloaded as intended.
| def ready(self): | ||
| from .services.sentencetTransformer_model import TransformerModel | ||
| TransformerModel.get_instance() |
There was a problem hiding this comment.
Calling TransformerModel.get_instance() unconditionally in ready() will run for every Django startup context (tests, migrations, management commands, autoreload) and can trigger a large model download/init even when no web traffic will be served. Consider gating this preload behind an explicit env flag (or limiting it to the web server entrypoint) to avoid slowing/fragilizing CI and one-off management commands.
| run: pip install -r server/requirements.txt | ||
| # Discover and run all files matching test_*.py or *_test.py under server/ | ||
| - name: Run tests | ||
| run: pytest server/ -v |
There was a problem hiding this comment.
pytest.ini was added under server/, but this workflow runs pytest server/ -v from the repo root. Pytest won’t automatically discover config files in subdirectories, so DJANGO_SETTINGS_MODULE/pythonpath may not be applied and Django tests can fail to initialize. Fix by either cd server && pytest -v or running pytest -c server/pytest.ini server/ -v (or setting DJANGO_SETTINGS_MODULE in the workflow env).
| run: pytest server/ -v | |
| run: pytest -c server/pytest.ini server/ -v |
Description
Related Issue
GitHub Issue #441
Manual Tests
Automated Tests
Documentation
Updated README with instructions for running backend tests
Reviewers
@taichan03 @amahuli03
Notes